TakeLab: Systems for Measuring Semantic Text Similarity

نویسندگان

  • Frane Saric
  • Goran Glavas
  • Mladen Karan
  • Jan Snajder
  • Bojana Dalbelo Basic
چکیده

This paper describes the two systems for determining the semantic similarity of short texts submitted to the SemEval 2012 Task 6. Most of the research on semantic similarity of textual content focuses on large documents. However, a fair amount of information is condensed into short text snippets such as social media posts, image captions, and scientific abstracts. We predict the human ratings of sentence similarity using a support vector regression model with multiple features measuring word-overlap similarity and syntax similarity. Out of 89 systems submitted, our two systems ranked in the top 5, for the three overall evaluation metrics used (overall Pearson – 2nd and 3rd, normalized Pearson – 1st and 3rd, weighted mean – 2nd and 5th).

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Measuring Conceptual Distance Using WordNet: The Design of a Metric for Measuring Semantic Similarity*

This paper describes the development of a metric for measuring the semantic distance or similarity of words using the WordNet lexical database. Such a metric could be of use in development of search engines and text retrieval systems, tasks for which the richness of natural language can cause difficulty. Further, such a metric can prove invaluable to psycholinguists who wish to study lexical se...

متن کامل

OPI-JSA at SemEval-2017 Task 1: Application of Ensemble learning for computing semantic textual similarity

Semantic Textual Similarity (STS) evaluation assesses the degree to which two parts of texts are similar, based on their semantic evaluation. In this paper, we describe three models submitted to STS SemEval 2017. Given two English parts of a text, each of proposed methods outputs the assessment of their semantic similarity. We propose an approach for computing monolingual semantic textual simil...

متن کامل

TAKELAB: Medical Information Extraction and Linking with MINERAL

Medical texts are filled with mentions of diseases, disorders, and other clinical conditions, with many different surface forms relating to the same condition. We describe MINERAL, a system for extraction and normalization of disease mentions in clinical text, with which we participated in the Task 14 of SemEval 2015 evaluation campaign. MINERAL relies on a conditional random fields-based model...

متن کامل

NTNU-CORE: Combining strong features for semantic similarity

The paper outlines the work carried out at NTNU as part of the *SEM’13 shared task on Semantic Textual Similarity, using an approach which combines shallow textual, distributional and knowledge-based features by a support vector regression model. Feature sets include (1) aggregated similarity based on named entity recognition with WordNet and Levenshtein distance through the calculation of maxi...

متن کامل

Measuring The Semantic Similarity Of Texts

This paper presents a knowledge-based method for measuring the semanticsimilarity of texts. While there is a large body of previous work focused on finding the semantic similarity of concepts and words, the application of these wordoriented methods to text similarity has not been yet explored. In this paper, we introduce a method that combines wordto-word similarity metrics into a text-totext m...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012